library(htmltools)
APP_URL <- "https://pchen-aitutor.hf.space" # ← your Gradio app URL
USER_ID <- "anonymous_user"
browsable(tagList(
# --- styles (fab, bubble, modal) ---
tags$style(HTML("
#tutor-fab{position:fixed;right:20px;bottom:20px;z-index:99999;padding:12px 14px;border-radius:999px;border:1px solid #ddd;background:#fff;box-shadow:0 8px 24px rgba(0,0,0,.15);cursor:pointer;font-weight:600}
#tutor-fab:hover{background:#f6f8ff}
#tutor-sel-btn{position:absolute;display:none;z-index:100000;padding:6px 10px;font-size:12px;border:1px solid #e5e7eb;border-radius:10px;background:#111827;color:#f9fafb;box-shadow:0 8px 24px rgba(0,0,0,.2);cursor:pointer;user-select:none}
#tutor-modal-overlay{display:none;position:fixed;z-index:99998;inset:0;background:rgba(0,0,0,.35)}
#tutor-modal{position:absolute;top:5%;left:50%;transform:translateX(-50%);width:min(980px,95vw);height:min(720px,90vh);background:#fff;border-radius:10px;box-shadow:0 10px 30px rgba(0,0,0,.25);overflow:hidden}
#tutor-modal-header{display:flex;justify-content:space-between;align-items:center;padding:10px 14px;background:#f5f5f7;border-bottom:1px solid #e5e5ea;font-weight:600}
#tutor-close-btn{cursor:pointer;border:none;background:transparent;font-size:18px}
#tutor-iframe{width:100%;height:calc(100% - 18px);border:0}
")),
# modal + iframe
tags$div(id="tutor-modal-overlay",
tags$div(id="tutor-modal",
tags$div(id="tutor-modal-header","AI Calculus Tutor", tags$button(id="tutor-close-btn","✕")),
tags$iframe(id="tutor-iframe", src="")
)
),
# floating button + selection bubble
tags$button(id="tutor-fab","Ask Tutor"),
tags$button(id="tutor-sel-btn","Ask Tutor"),
# script
tags$script(HTML(sprintf("
(function(){
const APP_URL = '%s';
const USER_ID = '%s';
function sel(){ try { return (window.getSelection?window.getSelection().toString():'').trim(); } catch(e){ return ''; } }
function openInline(q, action){
const p = new URLSearchParams({ user_id: USER_ID, q: q||'', action: action||'prefill' });
const frame = document.getElementById('tutor-iframe');
frame.src = APP_URL + '/?' + p.toString(); // Prefill handled by app from URL
document.getElementById('tutor-modal-overlay').style.display='block';
}
function openTab(q, action){
const p = new URLSearchParams({ user_id: USER_ID, q: q||'', action: action||'prefill' });
window.open(APP_URL + '/?' + p.toString(), 'ai_tutor_tab', 'noopener');
}
// floating button: prefill by default; Alt=send; Ctrl/Cmd=new tab
document.getElementById('tutor-fab').addEventListener('click', function(ev){
const q = sel();
const act = ev.altKey ? 'send' : 'prefill';
if (ev.ctrlKey || ev.metaKey) openTab(q, act); else openInline(q, act);
});
// selection bubble near highlighted text
const b = document.getElementById('tutor-sel-btn');
let last='';
function place(x,y){ b.style.left=(x+10)+'px'; b.style.top=(y+10)+'px'; }
function show(){ b.style.display='block'; } function hide(){ b.style.display='none'; }
document.addEventListener('mouseup', e => setTimeout(() => {
const t = sel(); if (t && t.length>=2 && t.length<=800){ last=t; place(e.pageX,e.pageY); show(); } else hide();
}, 20));
window.addEventListener('scroll', hide, {passive:true}); window.addEventListener('resize', hide);
// bubble click: prefill by default; Alt=send; Ctrl/Cmd=new tab
b.addEventListener('click', function(ev){
if (!last) return;
const act = ev.altKey ? 'send' : 'prefill';
if (ev.ctrlKey || ev.metaKey) openTab(last, act); else openInline(last, act);
hide();
});
// close modal
const overlay = document.getElementById('tutor-modal-overlay');
document.getElementById('tutor-close-btn').addEventListener('click', () => overlay.style.display='none');
overlay.addEventListener('click', e => { if (e.target===overlay) overlay.style.display='none'; });
})();
", APP_URL, USER_ID)))
))

By the end of this section, students will be able to:

R skills

- optimize()
- optim()
- solnp()

Unconstrained optimization seeks to find the maximum or minimum of a function without any restrictions on the variables. This forms the foundation for understanding more complex constrained optimization problems.
The following interactive diagram illustrates how to locate the top or the bottom of a function. A secant line is used to predict how the function may proceed.
The objective is to stop right at the top or bottom of the function. You can choose different functions from the drop-down menu to run the animation.
The animation shows how successive secant-line predictions approach the top or bottom of the function.
For a function \(f: \mathbb{R}\to \mathbb{R}\), we want to find the minimum or the maximum of \(f(x)\). Since a maximisation problem can always be formulated as the minimisation problem of its negation, \(\max_{x\in\mathbb{R}}f(x) \Rightarrow \min_{x\in\mathbb{R}}-f(x)\), we will focus on minimisation problems.
\[\min_{x \in \mathbb{R}} f(x)\]
First-Order Condition: For a function \(f(x)\) to have a local extremum at \(x = a\), it is necessary that: \[f'(a) = 0\]
Second-Order Condition: For a critical point \(x = a\) where \(f'(a) = 0\): if \(f''(a) > 0\), then \(a\) is a local minimum; if \(f''(a) < 0\), then \(a\) is a local maximum; if \(f''(a) = 0\), the test is inconclusive.
TechStart Inc. wants to maximize profit from their new software product. Market research shows demand follows \(D(p) = 1000 - 10p\) and production costs are \(C(q) = 5000 + 20q\). The profit function is:
\[\pi(p) = p \cdot D(p) - C(D(p)) = p(1000 - 10p) - (5000 + 20(1000 - 10p))\] \[\pi(p) = 1000p - 10p^2 - 25000 + 200p = 1200p - 10p^2 - 25000\]
Step 1: First-Order Condition \[\pi'(p) = 1200 - 20p = 0\] \[p^* = 60\]
Step 2: Second-Order Condition \[\pi''(p) = -20 < 0\]
Since \(\pi''(60) < 0\), \(p = 60\) gives a local maximum profit. Since the quadratic function has only one critical point, it is also the global maximum.
Step 3: Calculate Maximum Profit \[\pi(60) = 1200(60) - 10(60)^2 - 25000 = 72000 - 36000 - 25000 = 11000\]
profit <- function(p) {
1200*p - 10*p^2 - 25000
}
optimize(profit, interval = c(0, 200), maximum = TRUE)
## $maximum
## [1] 60
##
## $objective
## [1] 11000
A firm sells a premium product. Market demand decreases nonlinearly with price: \[q(p) = 1500\, e^{-0.04p},\] where \(p \ge 0\) is the price (in dollars) and \(q(p)\) is quantity demanded.
The firm’s total cost consists of a fixed overhead and a per-unit cost proportional to output: \[C(p) = 3000 \;+\; 18\, q(p).\] Revenue at price \(p\) is \(R(p) = p \cdot q(p).\)
Find the profit-maximising price \(p^*\) and the corresponding quantity \(q^* = q(p^*)\).
Complete the task using optimize() in the following R environment or in your local RStudio.
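If you want to check your answer afterwards, here is one possible numerical set-up with optimize() (a sketch; the names q and profit_fn are illustrative, not from the text):

```r
# Demand and profit as functions of price p (from the exercise)
q <- function(p) 1500 * exp(-0.04 * p)
profit_fn <- function(p) p * q(p) - (3000 + 18 * q(p))

# Maximise profit over a generous price interval
res <- optimize(profit_fn, interval = c(0, 500), maximum = TRUE)
res$maximum     # profit-maximising price p*
q(res$maximum)  # corresponding quantity q(p*)
```

Setting \(\pi'(p)=0\) by hand gives \(p^* = 18 + 1/0.04 = 43\); the numerical result should agree to within the solver's tolerance.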
For a multivariable function \(f:\mathbb{R}^n \to \mathbb{R}\) we want to find the minimum.
\[\min_{x \in \mathbb{R}^n} f(x)\]
First-Order Condition: For a function \(f:\mathbb{R}^n \to \mathbb{R}\) to have a local extremum at \(x = x_0\), it is necessary that: \[\nabla f(x_0) = 0\]
Second-Order Condition: For a critical point \(x = x_0\) where \(\nabla f(x_0) = 0\), consider the Hessian \[H = \nabla^2 f(x_0).\] If \(H\) is positive definite, \(x_0\) is a local minimum; if \(H\) is negative definite, a local maximum; if \(H\) is indefinite, a saddle point.
For \(n=2\), i.e. \(f\) is a function of two variables \(f(x,y)\), we have
\[\nabla f = \begin{pmatrix} f_{x} \\ f_{y} \end{pmatrix}\]
The second-order derivatives define the Hessian matrix: \[H = \begin{pmatrix} f_{xx} & f_{xy} \\ f_{yx} & f_{yy} \end{pmatrix}\]
Sarah owns a coffee shop that sells two main products: coffee and pastries. Her daily revenue depends on both the number of cups of coffee sold \(x\) and the number of pastries sold \(y\). Through market research, she has determined that her revenue function is
\[R(x, y) = 120x + 100y - 2x^2 - y^2 - xy\]
How can Sarah maximise her revenue?
Step 1: Find Critical Points by FOC
\(\nabla R =\begin{pmatrix} \frac{\partial R}{\partial x} \\ \frac{\partial R}{\partial y}\end{pmatrix}=\begin{pmatrix} 120-4x-y \\100 -2y -x \end{pmatrix}=\begin{pmatrix} 0 \\ 0\end{pmatrix}\)
Rearrange in matrix form \[\begin{pmatrix} 4 & 1 \\ 1 & 2 \end{pmatrix}\begin{pmatrix} x \\y\end{pmatrix}=\begin{pmatrix} 120 \\100\end{pmatrix} \]
Solve the two-equation system:
## [,1]
## [1,] 20
## [2,] 40
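The solution printed above can be reproduced with base R's solve() (a sketch):

```r
# Coefficient matrix and right-hand side of the FOC system
A <- matrix(c(4, 1,
              1, 2), nrow = 2, byrow = TRUE)
b <- c(120, 100)
solve(A, b)  # critical point (x, y) = (20, 40)
```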
Step 2: Compute eigenvalues of the Hessian matrix
\[H = \begin{pmatrix} -4 & -1 \\ -1 & -2 \end{pmatrix}\]
## [1] -1.585786 -4.414214
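The eigenvalues shown above can be computed with eigen() (a sketch):

```r
# Hessian of the revenue function at the critical point
H <- matrix(c(-4, -1,
              -1, -2), nrow = 2, byrow = TRUE)
eigen(H)$values  # both negative, so H is negative definite
```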
Step 3: Apply Second-Order Test
Since both eigenvalues of \(H\) (eigen(H)$values) are negative, the Hessian matrix \(H\) is negative definite, so the critical point (20, 40) is a local maximum. Since there is only one critical point, the local maximum is also the global maximum.
Step 4: Calculate Maximum Revenue \[R(20, 40) = 120(20) + 100(40) - 2(20)^2 - (40)^2 - (20)(40) = 3200\]
# Define the negative revenue function (because optim minimizes by default)
profit <- function(v) {
x <- v[1]
y <- v[2]
- (120*x + 100*y - 2*x^2 - y^2 - x*y) # Negative for maximization
}
# Use optim to maximize profit
optim(par = c(0, 0), fn = profit)
## $par
## [1] 20 40
##
## $value
## [1] -3200
##
## $counts
## function gradient
## 163 NA
##
## $convergence
## [1] 0
##
## $message
## NULL
library(plotly)
A <- matrix(c(-4, -1,
              -1, -2), 2, 2, byrow = TRUE)
# Create a grid over [-0.3, 0.3]
x <- seq(-0.3, 0.3, 0.01)
y <- seq(-0.3, 0.3, 0.01)
z <- outer(x, y, function(x, y) {
mapply(function(x, y) {
X <- c(x, y)
t(X) %*% A %*% X
}, x, y)
})
plot_ly(x = x, y = y, z = z, type = "surface") |>
layout(title = "Interactive Surface: z = xᵀAx",
scene = list(
xaxis = list(title = "dx"),
yaxis = list(title = "dy"),
zaxis = list(title = "dz")))

A company produces three products: \(x\), \(y\), and \(z\).
The total revenue (in thousands of dollars) is modeled by: \[R(x,y,z) = 40x+50y+60z-2x^2-2.5y^2-3z^2\] where \(x, y, z \ge 0\) are the quantities of the three products (in hundreds of units).
Task:
Find the production levels \((x, y,
z)\) that maximize revenue.
Instructions:
1. Write the gradient vector \(\nabla R(x,y,z)\) by computing the partial derivatives with respect to \(x\), \(y\), and \(z\).
2. Set \(\nabla R(x,y,z) = 0\) and solve for the critical point(s).
3. Form the Hessian matrix and evaluate it at each critical point.
4. Use the second-order test for multivariable functions to determine whether each point is a local maximum, local minimum, or saddle point.
5. Interpret the optimal production quantities and the maximum revenue.
Solve the problem numerically with optim() to confirm your analytic solution. You can complete the task in the following R environment or in your local RStudio.
Remarks: You can study the options of optim() by typing ?optim in R; these options can help you increase the precision of the solution.
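For checking your work afterwards, a numerical sketch with optim(), using box constraints to keep \(x, y, z \ge 0\) (this assumes the linear term in \(z\) is \(60z\), consistent with the pattern of the other coefficients):

```r
# Revenue function from the exercise (in thousands of dollars)
revenue <- function(v) {
  x <- v[1]; y <- v[2]; z <- v[3]
  40*x + 50*y + 60*z - 2*x^2 - 2.5*y^2 - 3*z^2
}

# optim() minimises, so negate; L-BFGS-B enforces x, y, z >= 0
res <- optim(par = c(1, 1, 1), fn = function(v) -revenue(v),
             method = "L-BFGS-B", lower = c(0, 0, 0))
res$par     # optimal production levels
-res$value  # maximum revenue
```

Analytically the FOC gives \(x^* = y^* = z^* = 10\) with revenue 750, which the numerical result should reproduce.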
GreenInvest, an ESG-focused investment firm, faces a portfolio optimization problem involving the allocation of $100 million across three green tech companies: SolarTech, EcoTransport, and CleanWater. Each offers different returns and risks, requiring the firm to maximize expected returns while strictly adhering to the constraint that total investment equals exactly $100 million.
This constraint transforms the problem from unconstrained optimization to a constrained optimization task. The Lagrange multiplier method is the standard method used to solve constrained optimization problems.
Now we want to motivate the Lagrange multiplier method. Constrained optimization can be seen as a problem of finding the highest point along a mountain road, where the height of the mountain is the value of the objective. We want to go as high as possible. The road presents the restriction: we are not allowed to leave the road.
The contour lines of the mountain give the same heights on the mountain respectively. Whenever the road crosses these contour lines it rises higher and higher.
When the road becomes tangent to a contour line and does not cut any higher contour line, it has reached its highest point: on both sides of the tangent point the road moves away from that contour line, which means the road goes downwards in both directions.
Let us use a general contour plot with a reference line to formalize the idea described above. As the red line moves from the lower right corner toward the upper left, it crosses contour lines with increasing values, meaning the surface rises along the line. At point A, it reaches the highest point with a value of 10. After that, as it crosses contour lines with decreasing values, the surface descends.
Now if we cut a slice at \(z = 10\), the contour line of \(f\) (the yellow ellipse) and the contour of \(g\) (the red line) touch each other at A, where they share the same tangent line. Since the gradients \(\nabla f\) and \(\nabla g\) are both perpendicular to the shared tangent line, they must be parallel to each other. This leads to
\[\nabla f = \lambda \nabla g,\] where \(\lambda\) is a scalar. Rearranging the equation above we have: \[\nabla f - \lambda \nabla g = 0.\] This can be seen as the FOC applied to the Lagrange function: \[ \mathcal L(x,\lambda) = f(x)-\lambda g(x).\] The FOC of the Lagrange function, \(\nabla \mathcal L(x,\lambda) = 0\), leads to \[ \nabla f(x) - \lambda \nabla g(x)=0, \hspace{1cm} g(x)=0.\]
The following 3D diagram shows the objective surface \(f(x,y)\), the constraint plane \(g(x,y)\), the contour lines of \(f\) and \(g\) and the shared tangent, as well as the gradients of \(f\) and \(g\) at the tangent point.
The argument above holds also for nonlinear constraints. (see the following graph)
Now we are in a position to formulate the Lagrange approach to solving constrained optimisation problems.
For a multivariable function \(f:\mathbb{R}^n \to \mathbb{R}\) we want to find the minimum under \(m\) constraints \(h_i(x)=0\) for \(i = 1,2,\dots,m\), where each constraint is a multivariable function \(h_i:\mathbb{R}^n \to \mathbb{R}\).
\[\min_{x \in \mathbb{R}^n} f(x)\] \[\text{s.t.} \hspace{0.5cm} h_i(x) = 0, \hspace{0.5cm} i = 1, \dots, m\]
Definition 2.1 (Lagrange Function)
\[\mathcal{L}(x, \lambda) = f(x) + \sum_{i=1}^m \lambda_i h_i(x)\] The multivariable function \(\mathcal{L}(x, \lambda)\) is called the Lagrange function.
Definition 2.2 (Lagrange First-Order Condition): For the function \(f:\mathbb{R}^n \to \mathbb{R}\) to have a local extremum at \(x = x_0\) under the constraints \(h_i(x_0)=0\), it is necessary that:
\[ \nabla_x \mathcal{L}(x, \lambda) = \nabla f(x) + \sum_{i=1}^m \lambda_i \nabla h_i(x) = 0, \quad h_i(x) = 0 \quad \text{for all } i\]
Definition 2.3 (Lagrange Second-Order Condition): A critical point \(x = x^*\) is a local minimum if the following condition holds:
For all nonzero vectors \(d \in \mathbb{R}^n\) such that
\[\nabla h_i(x^*)^\top d = 0 \text{ for all } i = 1, \dots, m, \\[0.5em]\]
\[d^\top \nabla^2_{xx} \mathcal{L}(x^*, \lambda) d > 0.\] The Lagrange second-order condition is less restrictive than the second-order condition in the unconstrained case, where a positive definite Hessian requires the above inequality to hold for every \(d\). The Lagrange SOC requires it only for those \(d\) tangent to the constraints (i.e. perpendicular to the gradients of the constraints).
The above is a general formulation of optimisation with \(m\) constraints. Often \(m=1\), i.e. there is only one constraint.
A firm wants to allocate its $100,000 advertising budget between two channels, spending \(x\) on the first channel and \(y\) on the second.
The estimated revenue generated from these investments is: \[ R(x, y) = 200\sqrt{x} + 300\sqrt{y} \] The firm must satisfy the total budget constraint: \[ x + y = 100{,}000 \]
We formulate this business problem as a maximisation problem subject to the equality constraint: \[ \max_{x, y} \quad R(x, y) = 200\sqrt{x} + 300\sqrt{y} \] \[ \text{subject to } x + y = 100000 \]
Step 1 We form the Lagrangian: \[ \mathcal{L}(x, y, \lambda) = 200\sqrt{x} + 300\sqrt{y} - \lambda(x + y - 100{,}000) \]
Step 2 First-Order Conditions (FOCs)
\[\begin{align*} \frac{\partial \mathcal{L}}{\partial x} &= \frac{100}{\sqrt{x}} - \lambda = 0 \quad (1) \\ \frac{\partial \mathcal{L}}{\partial y} &= \frac{150}{\sqrt{y}} - \lambda = 0 \quad (2) \\ \frac{\partial \mathcal{L}}{\partial \lambda} &= x + y - 100{,}000 = 0 \quad (3) \end{align*}\]
We have three unknowns \(x,y,\lambda\) and three equations. From equations (1) and (2): \[ \frac{100}{\sqrt{x}} = \frac{150}{\sqrt{y}} \Rightarrow \frac{\sqrt{y}}{\sqrt{x}} = \frac{3}{2} \Rightarrow \frac{y}{x} = \left(\frac{3}{2}\right)^2 = \frac{9}{4} \Rightarrow y = \frac{9}{4}x \]
Substitute into the budget constraint: \[ x + \frac{9}{4}x = 100{,}000 \Rightarrow \frac{13}{4}x = 100{,}000 \Rightarrow x = \frac{400{,}000}{13} \approx 30{,}769.23 \] \[ y = \frac{9}{4}x = \frac{900{,}000}{13} \approx 69{,}230.77 \]
Step 3 Second-Order Condition (SOC)
We verify the SOC using the bordered Hessian approach. Note that the constraint is linear, and the second derivative of a linear function is zero. We have \[\nabla^2_{xx} \mathcal{L}(x,y,\lambda) = \nabla^2 R(x,y) - \lambda\, \nabla^2 (x+y-100{,}000) = \nabla^2 R(x,y),\] so we only need the Hessian of \(R(x,y)\) for the SOC.
The second derivatives are: \[ \frac{\partial^2 R}{\partial x^2} = -50x^{-3/2} < 0, \quad \frac{\partial^2 R}{\partial y^2} = -75y^{-3/2} < 0 \]
Hessian matrix: \[ H = \begin{bmatrix} -50x^{-3/2} & 0 \\ 0 & -75y^{-3/2} \end{bmatrix} \]
Constraint gradient: \[ \nabla g = \begin{bmatrix} 1 \\ 1 \end{bmatrix} \]
Tangent-space vectors satisfy \(\nabla g^\top d = 0\), i.e. \(d_1 + d_2 = 0\), so up to scaling \[ d = \begin{bmatrix} 1 \\ -1 \end{bmatrix}. \] Note that to determine whether a matrix is negative definite we would have to consider arbitrary \(d\); here we only need to consider this specific direction: equal components with opposite signs.
Evaluate the SOC: \[ d^\top \nabla^2_{xx}\mathcal{L}\, d = d^\top H d = -50x^{-3/2} - 75y^{-3/2} < 0 \]
Therefore, the Hessian is negative definite on the tangent space \(\Rightarrow\) local maximum.
Determine the maximum revenue
\[R(x^*,y^*)=200\sqrt{30769.23}+300\sqrt{69230.77}=114017.5\]
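A quick numerical check of the tangent-space SOC at the analytic optimum (a sketch; it uses \(\partial^2 R/\partial x^2 = -50x^{-3/2}\) and \(\partial^2 R/\partial y^2 = -75y^{-3/2}\)):

```r
# Analytic optimum
x_star <- 400000 / 13
y_star <- 900000 / 13

# Hessian of R(x, y) = 200*sqrt(x) + 300*sqrt(y) at the optimum
H <- diag(c(-50 * x_star^(-3/2), -75 * y_star^(-3/2)))

# Tangent direction of the budget constraint x + y = 100000
d <- c(1, -1)
drop(t(d) %*% H %*% d)  # negative, so the point is a local maximum on the constraint
```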
library(Rsolnp)
# Objective function to maximize — note: Rsolnp minimizes by default, so we negate it
fn <- function(par) {
x <- par[1]
y <- par[2]
return(-(200 * sqrt(x) + 300 * sqrt(y))) # negative for maximization
}
# Equality constraint function: x + y = 100000
eqfun <- function(par) {
return(sum(par))
}
# Equality bound
eqB <- 100000
# Initial guess
start <- c(50000, 50000)
# Solve with Rsolnp
res <- solnp(pars = start,
fun = fn,
eqfun = eqfun,
eqB = eqB,
LB = c(0, 0)) # lower bounds for x and y
##
## Iter: 1 fn: -114017.5423 Pars: 30763.47320 69236.52680
## Iter: 2 fn: -114017.5423 Pars: 30763.47320 69236.52680
## solnp--> Completed in 2 iterations
Hover your mouse over the solution point, and you can see that the constraining plane becomes tangent to the contour line projected onto the (x, y) plane.
Definition 3.1 (KKT Conditions): For the problem: \[\min f(x) \quad \text{subject to} \quad g_i(x) = 0, \quad h_j(x) \leq 0\]
The Karush-Kuhn-Tucker conditions are:

- Stationarity: \(\nabla f(x^*) + \sum_i \lambda_i \nabla g_i(x^*) + \sum_j \mu_j \nabla h_j(x^*) = 0\)
- Primal feasibility: \(g_i(x^*) = 0\) and \(h_j(x^*) \le 0\) for all \(i, j\)
- Dual feasibility: \(\mu_j \ge 0\) for all \(j\)
- Complementary slackness: \(\mu_j h_j(x^*) = 0\) for all \(j\)
Comments: At the critical point, we can classify the inequality constraints into two groups: the binding group and the non-binding group. For the non-binding group, the inequality constraints do not play any role at the critical point; hence, they drop out of the Lagrangian function by setting the corresponding \(\mu_j = 0\). For the binding group, the inequality constraints become equality constraints, and we solve the problem as if only equality constraints were present, using the Lagrangian approach.
This is reflected in the Complementary slackness \(\mu_j h_j(x^*) = 0\).
Economic Interpretation: The multiplier \(\mu_j\) represents the shadow price of constraint \(j\). If \(\mu_j > 0\), the constraint is binding (active). If \(\mu_j = 0\), the constraint is not binding.
\[ \min_{x,y}\; f(x,y)=(x-5)^2+(y-5)^2 \]
Constraints.
\[
x+y=3,
\] \[
x+2y\ge 5.
\]
First-order conditions (KKT). Write the standard constraints \(h(x,y)=x+y-3=0\) and \(g(x,y)=5-(x+2y)\le 0\).
Lagrangian: \[ \mathcal L(x,y,\lambda,\mu)=(x-5)^2+(y-5)^2+\lambda\,h(x,y)+\mu\,g(x,y),\quad \mu\ge 0. \]
Stationarity: \[ \partial_x\mathcal L=2(x-5)+\lambda-\mu=0,\qquad \partial_y\mathcal L=2(y-5)+\lambda-2\mu=0. \] Feasibility: \(x+y=3,\; x+2y\ge 5\).
Dual feas./slackness: \(\mu\ge 0,\; \mu\,g(x,y)=0\).
Because the unconstrained projection onto \(x+y=3\) is \((1.5,1.5)\) and violates \(x+2y\ge 5\), the inequality is active at optimum: \(x+2y=5\). Solving \[ x+y=3,\quad x+2y=5 \;\Rightarrow\; (x^*,y^*)=(1,2). \] Multipliers from stationarity: \[ \lambda-\mu=8,\quad \lambda-2\mu=6 \;\Rightarrow\; (\lambda^*,\mu^*)=(10,2). \]
Second-order condition (SOC)
\(\nabla^2 f=2I\succ 0\) (strictly convex objective) and the constraints are affine, so the KKT point is the unique global minimizer.
# install.packages("Rsolnp") # if needed
library(Rsolnp)
# Objective
fn <- function(w){
x <- w[1]; y <- w[2]
(x - 5)^2 + (y - 5)^2
}
# Equality: x + y = 3
eqf <- function(w){
w[1] + w[2]
}
# Inequality: x + 2y >= 5 -> return x+2y with lower bound 5
ineqf <- function(w){
w[1] + 2*w[2]
}
ini <- c(0, 0)
res <- solnp(
pars = ini,
fun = fn,
eqfun = eqf, eqB = 3,
ineqfun = ineqf, ineqLB = 5, ineqUB = 1e6,
LB = c(-1e6, -1e6), UB = c(1e6, 1e6)
)
##
## Iter: 1 fn: 25.0000 Pars: 1.00000 2.00000
## Iter: 2 fn: 25.0000 Pars: 1.00000 2.00000
## solnp--> Completed in 2 iterations
res$pars
## [1] 1 2
res$values
## [1] 50 25 25
## --- KKT multipliers from stationarity ---
x <- res$pars[1]; y <- res$pars[2]
# Stationarity system:
# 2(x-5) + lambda - mu = 0
# 2(y-5) + lambda - 2*mu = 0
A <- rbind(c(1, -1), c(1, -2))
b <- c(-2*(x - 5), -2*(y - 5))
sol <- solve(A, b)
lambda <- sol[1]; mu <- sol[2]
list(x = x, y = y, lambda = lambda, mu = mu,
eq_ok = abs(x + y - 3) < 1e-8,
ineq_val = x + 2*y, ineq_active = abs(x + 2*y - 5) < 1e-8)## $x
## [1] 1
##
## $y
## [1] 2
##
## $lambda
## [1] 10
##
## $mu
## [1] 2
##
## $eq_ok
## [1] TRUE
##
## $ineq_val
## [1] 5
##
## $ineq_active
## [1] FALSE

(Analytically the constraint is active; ineq_active is reported FALSE only because the numerical solution misses \(x+2y=5\) by more than the 1e-8 tolerance.)
The interactive graph below shows the objective surface with its unconstrained minimum marked by the yellow point. The green plane represents the equality constraint, and the dark violet triangular prism represents the inequality constraint. The solution point is located where the objective surface, the green plane, and the boundary of the violet prism all meet. Here, the inequality constraint is binding.
The following contour plot, with the equality constraint (the yellow line), the inequality constraint (light yellow area), and the solution point, demonstrates an alternative view of the problem.
A manufacturer produces two standard products, \(x\) and \(y\), and one premium product, \(z\).
The profit function (in thousands of dollars) is given by: \[
\pi(x,y,z) = 40x^{0.5}y^{0.3}z^{0.2} - 0.4x^2 - 0.3y^2 - 0.5z^2
\] where \(x, y, z \ge 0\) are
production quantities (in hundreds of units).
The firm faces the following constraints:
Resource constraint (equality)
The total labor used by all products must equal 2400 hours: \[
4x + 6y + 8z = 2400
\] where the coefficients represent labor hours per hundred
units.
Market constraint (inequality)
The premium product’s output cannot exceed 60% of the total production:
\[
z \le 0.6(x + y + z)
\]
Task:

1. Write down the Lagrangian function for the problem, introducing a Lagrange multiplier \(\lambda\) for the equality constraint and \(\mu\) for the inequality constraint.
2. State the Karush–Kuhn–Tucker (KKT) conditions.
3. Solve the first-order conditions along with the constraints to find the optimal \((x^*, y^*, z^*)\), the associated multipliers, and the maximum profit.
4. Interpret the economic meaning of the Lagrange multipliers \(\lambda\) and \(\mu\).
Hint: Remember that for the inequality constraint \(g(x,y,z) \le 0\), the KKT complementary slackness condition requires \(\mu \cdot g(x,y,z) = 0\) and \(\mu \ge 0\).
Solve the problem with solnp() in the following R environment or in your local RStudio.
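The numerical set-up for this exercise can be sketched with solnp() as follows (names are illustrative; the starting values are a guess and may need tuning):

```r
library(Rsolnp)

# Profit (in thousands of dollars), negated because solnp minimises
neg_profit <- function(v) {
  x <- v[1]; y <- v[2]; z <- v[3]
  -(40 * x^0.5 * y^0.3 * z^0.2 - 0.4*x^2 - 0.3*y^2 - 0.5*z^2)
}

# Equality constraint: labor hours 4x + 6y + 8z = 2400
eqf <- function(v) 4*v[1] + 6*v[2] + 8*v[3]

# Inequality constraint: z <= 0.6(x + y + z), i.e. z - 0.6(x+y+z) <= 0
ineqf <- function(v) v[3] - 0.6 * sum(v)

start <- c(100, 100, 100)  # interior starting point
res <- solnp(pars = start, fun = neg_profit,
             eqfun = eqf, eqB = 2400,
             ineqfun = ineqf, ineqLB = -1e6, ineqUB = 0,
             LB = c(0, 0, 0))
res$pars  # candidate (x*, y*, z*)
```

Compare the numerical candidate with your analytic KKT solution; if the reported \(\mu\) is zero, the market constraint is not binding at the optimum.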
Real business optimization problems often combine multiple concepts: unconstrained optimization for initial analysis, equality constraints for budget limitations, and inequality constraints for practical restrictions.
A firm chooses production levels of two products, \(x\) and \(y\). Profit is given by
\[ \Pi(x,y) = 120x + 100y - 2x^2 - y^2 - xy . \]
This is an unconstrained maximisation problem.
\[ \frac{\partial \Pi}{\partial x} = 120 - 4x - y = 0, \quad \frac{\partial \Pi}{\partial y} = 100 - x - 2y = 0 \]
Solving:
\[ x^* = 20, \quad y^* = 40 \]
The Hessian is
\[ H = \begin{bmatrix} -4 & -1 \\ -1 & -2 \end{bmatrix} \]
\(\det(H) = 7 > 0\) and \(-4 < 0\) ⇒ \(H\) is negative definite ⇒ local (and here global) maximum.
# Profit function
Pi <- function(v) {
x <- v[1]; y <- v[2]
120*x + 100*y - 2*x^2 - y^2 - x*y
}
# Negate because optim() minimizes
negPi <- function(v) -Pi(v)
# Gradient of -Pi
grad_negPi <- function(v) {
x <- v[1]; y <- v[2]
gx <- -(120 - 4*x - y)
gy <- -(100 - x - 2*y)
c(gx, gy)
}
start <- c(0, 0)
fit <- optim(
par = start,
fn = negPi,
gr = grad_negPi,
method = "BFGS",
control = list(reltol = 1e-12)
)
fit$par    # x*, y*
## [1] 20.00000 39.99998
fit$value  # value of negPi at the optimum
## [1] -3200
Contour Plot
Interactive 3D Surface Plot
We form a portfolio of two ASX stocks: CBA.AX (Commonwealth Bank) and BHP.AX (BHP Group). Let weights be \(w_1\) (CBA) and \(w_2\) (BHP), with the budget constraint \(w_1 + w_2 = 1\). We maximize mean–variance utility with risk aversion \(\gamma>0\): \[ \max_{w_1,w_2}\; \hat{\mu}_1 w_1 + \hat{\mu}_2 w_2 -\frac{\gamma}{2}\Big[\hat{\sigma}_1^2 w_1^2 + 2\hat{\sigma}_{12} w_1 w_2 + \hat{\sigma}_2^2 w_2^2\Big]\]
\[ \text{s.t. } w_1+w_2=1.\]
Here \(\hat{\mu}_i\) are sample mean returns, and \[ \hat{\Sigma}= \begin{bmatrix} \hat{\sigma}_1^2 & \hat{\sigma}_{12}\\ \hat{\sigma}_{12} & \hat{\sigma}_2^2 \end{bmatrix} \quad\text{is the sample covariance matrix.} \]
We will estimate monthly log-return means and covariances from the last 5 years of data, then solve a mean–variance utility maximisation problem.
Let \(P_{i,t}\) be the adjusted closing price of asset \(i \in \{1,2\}\) at month \(t\) (with \(i=1\) for CBA, \(i=2\) for BHP). Define monthly log returns \[ r_{i t} \;=\; \ln\!\left(\frac{P_{i,t}}{P_{i,t-1}}\right), \qquad t=1,\dots,T. \]
Sample mean (monthly) \[ \hat{\mu} = \begin{bmatrix} \hat{\mu}_1\\[2pt] \hat{\mu}_2 \end{bmatrix} = \begin{bmatrix} \frac{1}{T}\sum_{t=1}^T r_{1t}\\[6pt] \frac{1}{T}\sum_{t=1}^T r_{2t} \end{bmatrix}. \]
Sample covariance (monthly) \[ \hat{\Sigma} = \begin{bmatrix} \hat{\sigma}_1^2 & \hat{\sigma}_{12}\\[2pt] \hat{\sigma}_{12} & \hat{\sigma}_2^2 \end{bmatrix}, \quad \hat{\sigma}_1^2=\frac{1}{T-1}\sum_{t=1}^T (r_{1t}-\hat{\mu}_1)^2,\; \hat{\sigma}_2^2=\frac{1}{T-1}\sum_{t=1}^T (r_{2t}-\hat{\mu}_2)^2,\; \hat{\sigma}_{12}=\frac{1}{T-1}\sum_{t=1}^T (r_{1t}-\hat{\mu}_1)(r_{2t}-\hat{\mu}_2). \]
(Optional annualisation: multiply means by \(12\) and standard deviations by \(\sqrt{12}\).)
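The annualisation convention above as a small helper function (illustrative, not from the text):

```r
# Convert monthly moments to annual: means scale by 12, standard deviations by sqrt(12)
annualise <- function(mu_monthly, sd_monthly) {
  list(mu_annual = 12 * mu_monthly,
       sd_annual = sqrt(12) * sd_monthly)
}

annualise(0.01, 0.05)  # e.g. 1% monthly mean, 5% monthly sd
```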
Given risk aversion \(\gamma>0\), choose \((w_1,w_2)\) to maximise mean–variance utility: \[ \max_{w_1,w_2}\;\; \hat{\mu}_1 w_1 + \hat{\mu}_2 w_2 \;-\;\frac{\gamma}{2}\left( \hat{\sigma}_1^2 w_1^2 + 2\hat{\sigma}_{12} w_1 w_2 + \hat{\sigma}_2^2 w_2^2 \right) \quad\text{s.t.}\quad w_1 + w_2 = 1. \]
(No-shorting \(0\le w_i\le1\) can be added later; here we keep only the equality constraint.)
Lagrangian \[ \mathcal{L}(w_1,w_2,\lambda) = \hat{\mu}_1 w_1 + \hat{\mu}_2 w_2 -\frac{\gamma}{2}\!\left(\hat{\sigma}_1^2 w_1^2 + 2\hat{\sigma}_{12} w_1 w_2 + \hat{\sigma}_2^2 w_2^2\right) +\lambda(1-w_1-w_2). \]
First-order conditions (FOC) \[ \frac{\partial \mathcal{L}}{\partial w_1}:\quad \hat{\mu}_1 - \gamma(\hat{\sigma}_1^2 w_1 + \hat{\sigma}_{12} w_2) - \lambda = 0, \] \[ \frac{\partial \mathcal{L}}{\partial w_2}:\quad \hat{\mu}_2 - \gamma(\hat{\sigma}_{12} w_1 + \hat{\sigma}_2^2 w_2) - \lambda = 0, \] \[ \frac{\partial \mathcal{L}}{\partial \lambda}:\quad w_1 + w_2 = 1. \]
Second-order condition (SOC)
The Hessian with respect to \((w_1,w_2)\) is \[
H
=
-\gamma
\begin{bmatrix}
\hat{\sigma}_1^2 & \hat{\sigma}_{12}\\[2pt]
\hat{\sigma}_{12} & \hat{\sigma}_2^2
\end{bmatrix}.
\] Since \(\hat{\Sigma}\succ 0\)
and \(\gamma>0\), we have \(H\prec 0\) (negative definite), so the FOC
point is a unique global maximiser.
Subtract the two FOCs to eliminate \(\lambda\): \[ (\hat{\mu}_1 - \hat{\mu}_2) - \gamma\big[(\hat{\sigma}_1^2 - \hat{\sigma}_{12})w_1 - (\hat{\sigma}_2^2 - \hat{\sigma}_{12})w_2\big] = 0. \]
Substitute \(w_2 = 1 - w_1\) and solve for \(w_1\): \[ \boxed{ w_1^* = \frac{(\hat{\mu}_1 - \hat{\mu}_2)\;+\;\gamma(\hat{\sigma}_2^2 - \hat{\sigma}_{12})} {\gamma\big(\hat{\sigma}_1^2 + \hat{\sigma}_2^2 - 2\hat{\sigma}_{12}\big)} }, \qquad \boxed{\,w_2^* = 1 - w_1^*\, .} \]
\[ \boxed{ \lambda = \frac{ \big(\hat{\sigma}_2^2 - \hat{\sigma}_{12}\big)\hat{\mu}_1 \;+\; \big(\hat{\sigma}_1^2 - \hat{\sigma}_{12}\big)\hat{\mu}_2 \;-\; \gamma\big(\hat{\sigma}_1^2 \hat{\sigma}_2^2 - \hat{\sigma}_{12}^2\big) }{ \hat{\sigma}_1^2 + \hat{\sigma}_2^2 - 2\hat{\sigma}_{12} } } \]
The denominator equals \(\mathrm{Var}(r_{1}-r_{2})=\hat{\sigma}_1^2+\hat{\sigma}_2^2-2\hat{\sigma}_{12}\), which is positive unless the assets are perfectly linearly dependent.
R code for empirical data extraction
# install.packages(c("tidyquant","dplyr","tidyr","lubridate"), dependencies = TRUE)
library(tidyquant)
library(dplyr)
library(tidyr)
library(lubridate)
symbols <- c("CBA.AX", "BHP.AX")
start <- floor_date(Sys.Date() - years(5), unit = "month")
end <- Sys.Date()
# 1) Download daily prices, then compute monthly log returns from adjusted prices
prices <- tq_get(symbols, from = start, to = end, get = "stock.prices")
monthly_ret <- prices %>%
group_by(symbol) %>%
tq_transmute(
select = adjusted,
mutate_fun = periodReturn,
period = "monthly",
type = "log",
col_rename = "ret"
) %>%
ungroup()
# 2) Wide matrix of returns: rows = months, columns = symbols
R <- monthly_ret %>%
select(symbol, date, ret) %>%
pivot_wider(names_from = symbol, values_from = ret) %>%
arrange(date) %>%
drop_na()
# 3) Empirical moments (monthly log returns)
r_mat <- as.matrix(R[, symbols])
mu <- colMeans(r_mat) # vector c(mu_CBA, mu_BHP)
Sigma <- cov(r_mat) # 2x2 covariance matrix
# Print results
mu
## CBA.AX BHP.AX
## 0.01786453 0.01022144
Sigma
## CBA.AX BHP.AX
## CBA.AX 3.592835e-03 9.802353e-05
## BHP.AX 9.802353e-05 4.819121e-03
R code for constrained optimization
library(Rsolnp)
# Two-asset portfolio optimization
mu <- c(0.08, 0.12)                      # Expected returns
Sigma <- matrix(c(0.04, 0.01,
                  0.01, 0.09), nrow = 2) # Covariance matrix
n <- 2

# Negative mean-variance utility: -mu'w + w' Sigma w  (for gamma = 2, gamma/2 = 1)
fn1 <- function(w) {
  -t(mu) %*% w + t(w) %*% Sigma %*% w
}

# The constraint that the weights sum to 1
eqn1 <- function(w) {
  sum(w)
}

# The initial value for the numerical procedure
ini <- c(0.5, 0.5)
solution <- solnp(ini, fun = fn1, eqfun = eqn1, eqB = c(1),
                  LB = rep(0, n), UB = rep(1, n))
##
## Iter: 1 fn: -0.06273 Pars: 0.54545 0.45455
## Iter: 2 fn: -0.06273 Pars: 0.54545 0.45455
## solnp--> Completed in 2 iterations
solution$pars
## [1] 0.5454546 0.4545454
fn1(solution$pars)
## [,1]
## [1,] -0.06272727
sum(solution$pars)
## [1] 1
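The solnp() weights can be checked against the closed-form \(w_1^*\) derived earlier, using the same \(\hat{\mu}\), \(\hat{\Sigma}\), and \(\gamma = 2\):

```r
mu    <- c(0.08, 0.12)
Sigma <- matrix(c(0.04, 0.01,
                  0.01, 0.09), nrow = 2)
gamma <- 2

# Closed-form optimal weight on asset 1
D  <- Sigma[1, 1] + Sigma[2, 2] - 2 * Sigma[1, 2]  # Var(r1 - r2)
w1 <- ((mu[1] - mu[2]) + gamma * (Sigma[2, 2] - Sigma[1, 2])) / (gamma * D)
c(w1, 1 - w1)  # matches the solnp result 0.5455, 0.4545
```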
R provides several powerful tools for solving optimization problems. Understanding when and how to use each tool is crucial for practical problem-solving.
- optim() - General-purpose optimization
- constrOptim() - Constrained optimization
- lpSolve::lp() - Linear programming
- quadprog::solve.QP() - Quadratic programming

| Function / Package | Best For | Constraints Supported | Advantages | Limitations |
|---|---|---|---|---|
| optim() | Unconstrained problems or simple box constraints | None, box ("L-BFGS-B") | Flexible, built-in, multiple methods (BFGS, CG, Nelder-Mead, etc.) | No general equality/inequality constraints |
| constrOptim() | Problems with linear inequality constraints | Linear inequality | Direct constraint handling, built into R | No equality constraints, slower for large problems |
| lpSolve::lp() | Linear objective with linear constraints | Linear equality & inequality | Very fast, guaranteed global optimum | Only linear problems |
| quadprog::solve.QP() | Quadratic objective with linear constraints | Linear equality & inequality | Efficient for mean–variance portfolio optimisation | Requires positive-definite quadratic term, linear constraints only |
| Rsolnp::solnp() | Nonlinear objective with equality/inequality constraints | Nonlinear equality & inequality + bounds | Handles general nonlinear constraints; supports bounds | Needs good starting values; may converge to local optima |
| nloptr::nloptr() | Large/complex nonlinear optimisation | Nonlinear equality & inequality + bounds | Many algorithms (global & local), gradient-free or gradient-based | Algorithm choice/tuning is critical |
| DEoptim::DEoptim() | Global optimisation, non-smooth or multi-modal problems | Bounds only | Robust to local minima; no gradient required | Slower for smooth problems; needs tuning of population size & iterations |
| GenSA::GenSA() | Global optimisation over complex search spaces | Bounds only | Works for discrete or rugged landscapes; automatic annealing schedule | Can be slow; stochastic results vary |
| optimx::optimx() | Comparing multiple local optimisation methods | Same as the optim() methods used | Simple interface to try several methods and compare results | Constraint support limited to the methods used |
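To illustrate quadprog::solve.QP() from the table, the two-asset mean–variance problem solved earlier with solnp() can be written as a quadratic program (a sketch, assuming the quadprog package is installed):

```r
library(quadprog)

mu    <- c(0.08, 0.12)
Sigma <- matrix(c(0.04, 0.01,
                  0.01, 0.09), nrow = 2)
gamma <- 2

# solve.QP minimises (1/2) w' D w - d' w  subject to  A' w >= b0,
# with the first meq constraints treated as equalities
res <- solve.QP(Dmat = gamma * Sigma,
                dvec = mu,
                Amat = cbind(c(1, 1)),  # budget constraint w1 + w2 = 1
                bvec = 1,
                meq  = 1)
res$solution  # optimal weights, same as the solnp result
```

Because the quadratic term \(\gamma\hat{\Sigma}\) is positive definite and the constraint is linear, solve.QP returns the exact global optimum without needing a starting value.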
This section provides review and application questions designed to assess understanding at five different levels. Each level builds upon the previous one, fostering deeper comprehension and the ability to apply concepts in increasingly complex situations.
https://pchen-exercisesfeedbackselect.hf.space
Optimization is fundamental to business decision-making:

- Strategic Planning: Resource allocation and capacity planning
- Financial Management: Portfolio optimization and risk management
- Operations: Production scheduling and supply chain optimization
- Marketing: Pricing strategies and budget allocation
The progression from unconstrained to constrained optimization mirrors the evolution from theoretical models to practical business applications. Understanding both the mathematical theory and computational implementation enables effective problem-solving in complex business environments.
This concludes Section 5: Optimization. Students should now be equipped to tackle real-world optimization problems using both analytical and computational approaches.